feat(core): add user transcription timeout by chenghao-mou · Pull Request #6182 · livekit/agents

chenghao-mou · 2026-06-22T13:36:02Z

Add a new event that is emitted when VAD detects speech but STT fails to produce any transcript within the timeout.

from livekit.agents import AgentSession, UserTranscriptionTimeoutEvent

session = AgentSession(
    stt=...,
    llm=...,
    tts=...,
    # VAD heard speech but no transcript landed within 5s of the user stopping.
    # 5.0 is the default; set to None to disable.
    transcription_timeout=5.0,
)

@session.on("user_transcription_timeout")
def _on_transcription_timeout(ev: UserTranscriptionTimeoutEvent) -> None:
    # ev.speech_duration       -> total VAD speech (s) this turn that produced no transcript
    # ev.vad_speech_started_at -> when the user first started speaking (epoch seconds)

    # Ignore very short blips (coughs, door slams) that VAD picks up as "speech".
    if ev.speech_duration < 0.7:
        return

    session.generate_reply(
        instructions="Tell the user you didn't catch that and ask them to repeat it.",
    )

devin-ai-integration

Devin Review found 2 potential issues.

devin-ai-integration · 2026-06-22T13:38:40Z

🚩 New event not forwarded in SessionHost remote transport

The SessionHost in remote_session.py registers handlers for specific events (lines 371-379) and forwards them over the transport. The new user_transcription_timeout event is not registered or forwarded. This means remote sessions won't receive this event. This is likely acceptable as a first iteration (not all events need remote transport support immediately), but it's an inconsistency with the event being in EventTypes and AgentEvent.

(Refers to lines 366-379)

devin-ai-integration · 2026-06-22T13:38:41Z

+    def _on_transcription_timeout(self) -> None:
+        self._transcription_timeout_handle = None
+        if self._user_turn_start is None or self._turn_transcript_received:
+            return
+
+        if self._agent_speaking:
+            return
+
+        self._hooks.on_transcription_timeout(
+            speech_duration=self._turn_speech_duration, turn_start=self._user_turn_start
+        )


🚩 Transcription timeout suppressed during agent speech with no re-arm

When _on_transcription_timeout fires while the agent is speaking (audio_recognition.py:1809), the callback returns silently without emitting the event and without scheduling a retry. This means if VAD detects user speech that produces no transcript, and the timeout happens to fire while the agent is still speaking (e.g. the user spoke over the agent near the end of its turn), the timeout event is permanently lost for that speech burst. The rationale is likely that speech detected during agent output is often echo/noise, but with AEC enabled it could be genuine user speech. Whether this is acceptable depends on the use case.

Since there is no user content, we should still emit the signal so the agent can check.
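One possible way to avoid losing the event, sketched here under assumptions rather than taken from the PR, is to defer the callback while the agent is speaking instead of returning. The class below is a toy model built on asyncio; `TranscriptionTimeoutTimer`, `retry_delay`, and the attribute names are hypothetical and are not part of livekit-agents.

```python
import asyncio
from typing import Callable, Optional


class TranscriptionTimeoutTimer:
    """Toy per-turn timeout timer (hypothetical, not the LiveKit implementation)."""

    def __init__(
        self,
        timeout: float,
        on_timeout: Callable[[float], None],
        retry_delay: float = 0.05,  # assumed polling interval while the agent talks
    ) -> None:
        self._timeout = timeout
        self._on_timeout = on_timeout
        self._retry_delay = retry_delay
        self._handle: Optional[asyncio.TimerHandle] = None
        self._speech_duration = 0.0
        self.transcript_received = False
        self.agent_speaking = False

    def arm(self, speech_duration: float) -> None:
        # Restart the timer for the latest speech burst.
        self.cancel()
        self._speech_duration = speech_duration
        loop = asyncio.get_running_loop()
        self._handle = loop.call_later(self._timeout, self._fire)

    def cancel(self) -> None:
        if self._handle is not None:
            self._handle.cancel()
            self._handle = None

    def _fire(self) -> None:
        self._handle = None
        if self.transcript_received:
            return
        if self.agent_speaking:
            # Re-arm instead of dropping the event: check again shortly.
            loop = asyncio.get_running_loop()
            self._handle = loop.call_later(self._retry_delay, self._fire)
            return
        self._on_timeout(self._speech_duration)


async def demo() -> list[float]:
    fired: list[float] = []
    timer = TranscriptionTimeoutTimer(timeout=0.05, on_timeout=fired.append)
    timer.agent_speaking = True
    timer.arm(speech_duration=1.2)
    await asyncio.sleep(0.12)  # the timeout elapses while the agent is speaking
    assert fired == []         # deferred, not emitted and not lost
    timer.agent_speaking = False
    await asyncio.sleep(0.12)  # the next retry now emits the event
    return fired
```

Whether deferring is preferable to emitting immediately or to dropping the event depends on how often speech detected during agent output is echo rather than a real user turn.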

longcw · 2026-06-24T05:51:21Z

            self._user_speaking_event.clear()
            self._last_speaking_time = time.time() - ev.silence_duration - ev.inference_duration

+            self._arm_transcription_timeout(ev.speech_duration)


maybe skip this when stt is not set?
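The suggestion above amounts to a guard in front of the arming call. A minimal sketch, assuming a hypothetical `Recognizer` stand-in whose `stt` and `transcription_timeout` fields mirror the session options (the real check in the PR may look different):

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Recognizer:
    """Hypothetical stand-in for the audio recognition state."""

    stt: Optional[object]
    transcription_timeout: Optional[float] = 5.0
    armed: list[float] = field(default_factory=list)

    def on_end_of_speech(self, speech_duration: float) -> None:
        # Without STT no transcript can ever arrive, so a timeout would always
        # fire; a timeout of None means the feature is disabled.
        if self.stt is None or self.transcription_timeout is None:
            return
        # Stand-in for arming the real timer: record the speech duration.
        self.armed.append(speech_duration)
```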

davidzhao · 2026-06-24T05:54:53Z

should we have a default handler with the above implementation? any downsides of having this work automatically?

longcw · 2026-06-24T06:01:22Z

should we have a default handler with the above implementation? any downsides of having this work automatically?

I am wondering how many false alarms it may have; it can be risky if there is some background noise.
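If a default handler were added, the false-alarm concern could be reduced by filtering before replying. The sketch below is only an illustration: `TimeoutReplyGate` is a hypothetical helper, the 0.7 s threshold comes from the example in the PR description, and the 10 s cooldown is an assumed value.

```python
import time
from typing import Callable, Optional


class TimeoutReplyGate:
    """Hypothetical filter deciding whether a timeout event deserves a reply."""

    def __init__(
        self,
        min_speech: float = 0.7,   # ignore blips shorter than this (seconds)
        cooldown: float = 10.0,    # assumed minimum gap between replies (seconds)
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._min_speech = min_speech
        self._cooldown = cooldown
        self._clock = clock
        self._last_reply: Optional[float] = None

    def should_reply(self, speech_duration: float) -> bool:
        now = self._clock()
        # Short bursts are more likely coughs or background noise than speech.
        if speech_duration < self._min_speech:
            return False
        # Persistent noise would otherwise trigger a reply on every burst.
        if self._last_reply is not None and now - self._last_reply < self._cooldown:
            return False
        self._last_reply = now
        return True
```

A handler for `user_transcription_timeout` could call `should_reply(ev.speech_duration)` and only then ask the user to repeat themselves.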

feat(core): add user transcription timeout

0aa24fb

chenghao-mou requested a review from a team as a code owner June 22, 2026 13:36

devin-ai-integration Bot reviewed Jun 22, 2026

longcw reviewed Jun 24, 2026

chenghao-mou added 3 commits June 24, 2026 15:13

gate the timer behind stt check

7851c71

use stt pipeline instead of stt

98e470e

fix tests and reformat

f50d659


feat(core): add user transcription timeout #6182
chenghao-mou wants to merge 4 commits into main from chenghao/feat/stt-transcription-timeout-AGT-3024


3 participants
